The Ubuntu Chat Corpus for Multiparticipant Chat Analysis
Abstract
We present the Ubuntu Chat Corpus as a data source for multiparticipant chat analysis. This addresses the lack of large, publicly available corpora suitable for research in this medium. The advantages of using this corpus for research are its large number of chat messages, its multiple languages, its technical nature, and the fact that all of the original chat messages are in the public domain.
Similar Resources
Extending Word Highlighting in Multiparticipant Chat
We describe initial work on extensions to word highlighting for multiparticipant chat to aid users in finding messages of interest, especially during times of high traffic in chat rooms. We have annotated a corpus of chat messages from a technical chat domain (Ubuntu’s technical support), indicating whether they are related to Ubuntu’s new desktop environment Unity. We also created an unsupervi...
Detecting Bot-Answerable Questions in Ubuntu Chat
Ubuntu’s Internet Relay Chat technical support channel has bots that output specific messages in response to command words from other channel users. These messages can be used to answer frequently-asked questions instead of requiring an expert to (repeatedly) type a lengthy reply. We describe an approach to automatically distinguish bot-answerable questions, which would mitigate this problem. T...
Ubuntu-fr: a Large and Open Corpus for Supporting Multi-Modality and Online Written Conversation Studies
We present a large, free, French corpus of online written conversations extracted from the Ubuntu platform’s forums, mailing lists and IRC channels. The corpus is meant to support multi-modality and diachronic studies of online written conversations. We choose to build the corpus around a robust metadata model based upon strong principles, such as the "stand off" annotation principle. We detail...
Multiparticipant chat analysis: A survey
We survey research on the analysis of multiparticipant chat. Multiple research and applied communities (e.g., AI, educational, law enforcement, military) have interest in this topic. After introducing some context, we describe relevant problems and how these have been addressed using AI techniques. We also identify recent research trends and unresolved issu...
Healthcare Priority-Setting: Chat-Ting Is Not Enough; Comment on “Swiss-CHAT: Citizens Discuss Priorities for Swiss Health Insurance Coverage”
CHAT has its limits. It is a three-hour exercise. However, the real world problems of healthcare rationing and priority-setting are too complex for a three-hour exercise. What is needed, as a supplement, are sustained processes of rational democratic deliberation that can address the challenges to healthcare justice posed by costly emerging medical technologies, such as these targeted cancer th...
Publication date: 2013